Quality of child anthropometric data from SISVAN, Brazil, 2008-2017

ABSTRACT OBJECTIVE To evaluate the quality of anthropometric data of children recorded in the Food and Nutrition Surveillance System (SISVAN) from 2008 to 2017. METHOD Descriptive study on the quality of anthropometric data of children under five years of age admitted in primary care services of the Unified Health System, from the individual databases of SISVAN. Data quality was annually assessed using the indicators: coverage, completeness, sex ratio, age distribution, weight and height digit preference, implausible z-score values, standard deviation, and normality of z-scores. RESULTS In total, 73,745,023 records and 29,852,480 children were identified. Coverage increased from 17.7% in 2008 to 45.4% in 2017. Completeness of birth date, weight, and height corresponded to almost 100% in all years. The sex ratio was balanced and approximately similar to the expected ratio, ranging from 0.8 to 1. The age distribution revealed higher percentages of registrations from the ages of two to four years until mid-2015. A preference for terminal digits “zero” and “five” was identified among weight and height records. The percentages of implausible z-scores exceeded 1% for all anthropometric indices, with values decreasing from 2014 onwards. A high dispersion of z-scores, including standard deviations between 1.2 and 1.6, was identified mainly in the indices including height and in the records of children under two years of age and residents in the North, Northeast, and Midwest regions. The distribution of z-scores was symmetric for all indices and platykurtic for height/age and weight/age. CONCLUSIONS The quality of SISVAN anthropometric data for children under five years of age has improved substantially between 2008 and 2017. Some indicators require attention, particularly for height measurements, whose quality was lower especially among groups more vulnerable to nutritional problems.


INTRODUCTION
Anthropometry is universally used for nutritional surveillance of population groups 1,2 .Anthropometric data are periodically collected to provide a clear understanding of the magnitude and distribution of nutritional problems in a country, as well as to design and monitor interventions to improve the nutritional status of the population [3][4][5] .The availability of accurate prevalence estimates of stunting, wasting, overweight and obesity in children is essential to monitor local, national and global progress towards the goals of eradicating hunger and all forms of malnutrition 6 .
In Brazil, the monitoring of nutritional status is part of Food and Nutrition Surveillance (FNS), provided by the law that created the Brazilian Unified Health System (SUS), and consists of the continuous description of food and nutrition conditions of the Brazilian population 7 .The collection, recording and analysis of anthropometric data are routinely performed through population surveys by health professionals in primary care services, aimed at planning and organizing nutritional care and attention in SUS 7 .For anthropometric data to generate reliable information about the nutritional and health status of the local population, it is necessary to follow quality standards in the collection, recording and analysis of such data.
The quality of anthropometric data can be affected by multiple factors including sampling strategy, team training, measurement techniques and tools, non-response rate, data entry and processing methods [8][9][10] .Thus, several indicators have been proposed and used to assess the quality of such data, including population coverage 11,12 , completeness of birth date and anthropometric measurements 13,14 , preference for digits of age, height, and weight 15,16 , percentage of biologically implausible values 17 , as well as dispersion and distribution of standardized measures of weight and height 10,18 .
These indicators have been widely used to verify and control the quality of anthropometric data in population surveys and research, such as demographic and health surveys, in which they are used to account for the variability in data quality among different sites over time 10 .However, the application of these indicators has still been incipient to assess the quality of data routinely collected in health services.In Brazil, coverage indicators have been solely used to assess the quality of anthropometric data of the population assisted in SUS health services 11,12 .
Aiming to expand this approach, the objective of this study was to evaluate the quality of anthropometric data of children under five years of age recorded in the Food and Nutrition Surveillance System (SISVAN), a tool of the Ministry of Health to monitor the nutritional status of Brazilians served in Primary Health Care (PHC).This study covers the evaluation of multiple quality indicators, recommended by the Technical Expert Advisory group on nutrition Monitoring (TEAM) of the World Health Organization (WHO) and the United Nations Children's Fund (UNICEF) 19 .This study is expected to guide those involved in FNS actions on how to improve the quality of anthropometric data in order to provide greater reliability to the metrics for local, state and national monitoring of the nutritional status of the Brazilian children population.

METHOD Design of the Study
This is a descriptive study on the evaluation of the quality of anthropometric data of children aged 0 to 59 months, attended in PHC services in Brazil, in the period between 2008 and 2017.The information was obtained from the individual and anonymized SISVAN databases.
The raw data from SISVAN, made available for use in this project by the Center for Data and Knowledge Integration for Health (CIDACS), Oswaldo Cruz Foundation (Fiocruz), were used in accordance with institutional protocols for data security and privacy and as established by Resolution 466/2012 of the National Research Ethics Committee of the National Health Council.The project was submitted and approved by the ethics committee of the Institute of Collective Health of the Federal University of Bahia (CAAE: 41695415.0.0000.5030).

Data Source
The SISVAN databases are composed of nutritional and food monitoring records from the PHC Health Information Systems.Regarding nutritional status, these databases include anthropometric data recorded in the Bolsa Familia Program (BFP) Management System, in e-SUS APS, and in SISVAN 20 .Data on the nutritional status monitoring of BFP beneficiaries, which occurs at least twice a year, are incorporated into SISVAN at the end of each BFP term (first term from January to June and second term from July to December).The records from the e-SUS APS are gradually incorporated into the Health Information System for Primary Care, respecting the schedule of data submission by health teams, and then exported to SISVAN after processing and validation of data 20 .

Quality Indicators for Anthropometric Data
The quality of anthropometric data was assessed by means of multiple indicators, recommended by WHO-Unicef 19 : 1) completeness (coverage of the target population, completeness of date of birth, and completeness of anthropometric measurements); 2) sex ratio; 3) age distribution (histograms of age in years/months and month of birth; and index of dissimilarity for age in months); 4) digit preference of height and weight (histograms of terminal digits and whole numbers; and index of dissimilarity for terminal digits); 5) implausible z-score values (percentage of implausible z-scores); 6) standard deviation of z-scores; and 7) normality of z-scores (histograms, skewness, and kurtosis).To analyze the indicators, data on sex, date of birth, age (months and years), height (cm) and weight (kg) measurements were used, as well as the z-scores of the anthropometric indices commonly used to assess the nutritional status of children: Height/Age (H/A), Weight/Age (W/A), Weight/Height (W/H), and Body Mass Index/Age (BMI/A).Z-scores were calculated using the "STATA igrowup package" tool and the WHO child growth reference curves 21 .The estimated quality indicators are described in detail in Chart 1.All records with complete birth month and year.

Index of dissimilarity for age (in months)
The Myers "unblended" index was analyzed, according to the formula below, to identify the percentage of records deviating from a uniform distribution of age in months (0-59 in which Xis is the observed percentage and Xie is the expected percentage.The index ranges from 0 to 90, being 0 the optimum value.
All records with complete birth month and year.
Preference for height and weight digits Histograms of terminal digits and whole numbers for height (cm) and weight (kg) Histograms were used to assess the pattern of distribution of terminal digits and whole numbers of height and weight.An approximately uniform distribution is expected.
All records with weight and/or height information.
Index of dissimilarity for terminal digits of height (cm) and weight (kg) The Myers "unblended" index was analyzed, according to the formula below, to identify the percentage of records deviating from a uniform distribution for terminal digits of height and weight 2 in which X is is the observed percentage and X ie is the expected percentage.The index ranges from 0 to 90, being 0 the optimum value.
All records with weight and/or height information.
All records with date of birth, weight and/or height information.

Standard deviation of plausible z-score values for each anthropometric index
Standard deviation was calculated using the following formula in which Y ˉ = mean, s = standard deviation (calculated with 'n' in the denominator instead of n-1) and n = sample size.It is generally accepted that a coefficient <-0.5 or >0.5 indicates skewness All records with biologically plausible z-score values for the anthropometric index of interest.

Kurtosis of plausible z-score values for each anthropometric index
Kurtosis was calculated by the Fisher-Pearson coefficient: in which Y ˉ = mean, s = standard deviation (calculated with 'n' in the denominator instead of n-1) and n = sample size.In general, it is accepted that a coefficient <2 or >4 indicates kurtosis.1).The coverage of the total (resident) population increased from 14.3% to 34.6% in that same period.The increase in coverage was observed in all regions, with emphasis on the Northeast, where it was the highest in all years and the only region to reach more than 50% of the target population.Among the federative units, the highest coverage was observed in Minas Gerais (71.6% in 2017), Paraíba (69.7%),Piauí (69.4%),Ceará (58.9%), and Maranhão (58.9%).In 2017, only 11 federative units had coverage greater than 50% (Table 2).

Completeness of Date of Birth and Anthropometric Measurements
The percentage of records with complete birth date in SISVAN was high for the entire period studied, ranging from 99.9% in 2008 to 100% in 2017 (Table 1).The percentage of records with completed weight and height measurements was also high, showing 100% in almost all years (Table 1).

Sex Ratio
The sex ratio ranged between 0.8 and 1 over the years.Variability in this indicator was identified among 2012 and 2014, when ratios showed higher numbers of girls compared to boys (Table 1).

Age Distribution
Histograms of age in completed years revealed a pattern of higher percentages of registrations among ages two to four years until mid-2015, when the age distribution became more uniform (Figure 1a).Such a pattern was observed mainly in the North and Northeast regions.Birth months showed approximately uniform distribution regardless of year, sex, and region (Figure 1b).According to the index of dissimilarity, the percentage of records to be redistributed to obtain a uniform distribution of age in months reduced from 20.4% in 2008 to 3.8% in 2017 (Table 1).

Digit Preference for Height and Weight
Histograms in Figure 2a show almost 100% of preferences for the terminal digit zero for height, and terminal digits zero and five for weight.Almost 90% of height records (largest possible number) would need to be redistributed to obtain an uniform distribution of terminal digits across all years; while for terminal digits of weight, the percentage of records that would need to be redistributed reduced from 50.2% in 2008 to 40.4% in 2017 (Table 1).The distribution of whole numbers showed several noticeable peaks for height measurement (e.g., 100 cm) (Figure 2a), while the distribution of whole numbers for weight was very adequate, showing no peaks (Figure 2b).

Implausible Z-score Values
The percentages of implausible z-scores according to WHO cutoff points varied until mid-2014, when a decrease was observed in the following years for all anthropometric indices (Table 1): H/A (3.5% in 2014 vs. 1.9% in 2017), W/H (3.2% vs. 2.2%), W/A (2.4% vs. 0.8%), and BMI/A (4.8% vs. 2.9%).The highest percentages of implausibility were identified among records for children under two years old and from the North, Northeast, and Midwest regions (Figure 3).

Dispersion of Z-score Values
Standard deviation values greater than 1 for plausible z-score measures were found for all anthropometric indices throughout the period (Table 1).The lowest standard deviation values

Normality of Z-score Values
The distribution curves of z-scores in 2008 and 2017 showed flattening and leftward deviation for H/A, and more modest flattening and rightward deviations were observed in the distributions of z-scores of W/H, W/A, and BMI/I, compared to the normal distribution pattern of WHO child growth curves (Figure 5).According to Fisher-Pearson coefficient values, the distribution of z-scores was symmetric for all indices, whereas the distributions of H/A and W/A z-scores were platykurtic in all years, which is explained by coefficients above 4 and flattening of the curve (Table 1).

DISCUSSION
This study examined the quality of anthropometric data of children seen in PHC between 2008 and 2017.It is the first one assessing the quality of individual SISVAN data in Brazil, covering multiple indicators and different dimensions.Overall, the results show that the quality of anthropometric data collected and recorded in PHC information systems has improved substantially over the years.These findings represent a milestone for the consolidation of SISVAN, allowing better clarity and reliability in the use and analysis of data to identify nutritional problems in the population and for decision making in food and nutrition policies.
Completeness is one of the dimensions of data quality that is directly related to selection biases and, consequently, to the representativeness of the results.High completeness was observed throughout the period in date of birth and anthropometric measurements, which are mandatory information for registration.The results also showed a growing expansion of SISVAN coverage over the years, especially among the SUS user population.This expansion can be attributed to important advances: the successful articulation of FNS actions with other education and social assistance policies (for example, the School Health Program and the BFP 7 ); expansion of the coverage of community health agents and family health teams11; expansion and qualification of PHC, through the Family Health Support Teams and the Primary Health Care Access and Quality Improvement Program (PMAQ-AB)7; and investment in FNS actions through the Financing for Food and Nutrition Actions (FNA) and financial support for municipalities to purchase appropriate anthropometric equipment for primary care 22 .
Despite the clear progress over the years, the coverage of SISVAN still remains incipient in most regions and federative units of the country.In 2017, only 11 out of the 27 federative units had coverage above 50%.Also, nine of the 11 federative units are from the Northeast region, which concentrates the largest number of BFP beneficiaries 23 .Previous studies show that about 80% of anthropometric records incorporated by SISVAN, between 2008 and 2013, came from the BFP conditionalities 11 , which include nutritional monitoring of children under seven years of age, reflecting in higher coverage of SISVAN for the most socioeconomically vulnerable Brazilian population.It is worth noting that, due to the Covid-19 pandemic and its effects on health services, a substantial drop in BFP conditionalities monitoring was detected in 2020, since of the 7.3 million beneficiary children, only two million had follow-up records, which corresponds to coverage of only 25.5% 24 .
Another important aspect related to the external validity and representativeness of the data is the distribution of the population according to sex and age.The sex ratio found in SISVAN was balanced and approximately similar to that expected for the Brazilian population under five years of age (1.05) 25 .The age distribution in complete years and months did not show substantial peaks.However, a pattern of higher percentages of records between the ages of two to four years was observed until mid-2015, when the age distribution became more uniform.This pattern was observed mainly in the North and Northeast regions.As these regions concentrate the largest number of records in The preference for height and weight digits may signal from the rounding of measurements to the use of inadequate equipment and care during data collection and recording.
A preference for the terminal digits zero and five was observed in measurements of height and weight, indicating systematic error by the rounding of measurements.Among whole numbers, several noticeable peaks for height measurements were observed (e.g., 100 cm), revealing possible problems with equipment or rounding of measurements.On the other hand, the distribution of whole numbers for weight was very adequate.
Although critical to obtaining accurate prevalence of nutritional status, these results are relatively common and expected, especially for height measurements 8 .In most anthropometric rulers, centimeter marks are larger and easier to read than millimeter marks, inducing less diligent or less knowledgeable staff to record rounded values.
As observed in the SISVAN data, rounding of weight measurements was less common, possibly due to the use of digital scales whose displays provide numerical values with easy-to-read decimals.
It is also worth mentioning the adequacy of structures and equipment for collecting these data in basic health units.According to a recent study, based on data from the external evaluation of PMAQ-AB in 2014, only 35% of primary health units in Brazil had an adequate structure for the development of FNS, including adult and child scales, anthropometric ruler, measuring tape, and child health booklet 26 .
The implausibility, dispersion, and normality indicators of z-scores are usually associated with measurement errors, inaccurate date of birth, or errors in data recording.Despite the reduction of implausible values among SISVAN records from 2014, the percentages still exceeded 1% for all indices, suggesting low data quality according to the WHO implausibility system 2,19 .A similar result was found for the z-score dispersion indicator.Although values of standard deviation remained stable over the years, a large dispersion of z-scores was noted for most indices.Previous studies have reported wide variation in the standard deviation of anthropometric z-scores in children under five years of age in demographic and health surveys in several low-and middle-income countries 10,18 .
Consistent patterns of higher percentages of implausibility and dispersion of z-scores were observed among anthropometric indices including height measurement, and records of children under two years of age and residents of the North, Northeast, and Midwest regions.Such results point to well-known errors and limitations on the collection and recording of anthropometric measurements.It is generally expected that the standard deviation of H/A and BMI/A is higher than that of the other indices, due to the greater difficulty and chance of errors in collecting height and age measurements.
This pattern is especially expected in the group of children younger than two years of age, whose height is measured with them lying down and the accuracy of age in months is more critical due to the faster growth rate in this age group (19,27) .Furthermore, it is noteworthy that regions where the standard deviation and percentage of implausible values were higher are the most vulnerable from the point of view of adequate structure for nutritional surveillance in primary health units, according to a study with data from PMAQ-AB 26 .
Different parameters and normality measures were used to assess the distribution of z-scores for each anthropometric index.From the Kernel density plots, we observed distribution deviations to the left for H/A, and to the right for W/H, W/A and BMI/A, as compared to the normal distribution pattern of the WHO growth curves.Although the z-score distributions were symmetric for the four indices, kurtic distributions were identified for H/A and W/A (Fisher-Pearson coefficient > 4).
Despite these findings, there is still no consistent evidence to suggest that the dispersion and deviation from a Gaussian distribution is due to data quality alone.The WHO reference population used to derive the z-scores was restricted to a healthy population living in favorable environmental conditions for healthy growth 28 .Thus, it is possible that unusual distributions may occur in more heterogeneous populations, such as in countries with large social inequalities.The SISVAN population represents the users of PHC in Brazil, composed mostly of BFP beneficiaries; i.e., a more socioeconomically vulnerable population.
The distributions found in this study are consistent with estimates of malnutrition in this population, which reveal persistent prevalence of short stature and increasing burden of overweight and obesity in children 29,30 .
This study has some limitations.Interpretation of certain indicators alone may not be sufficient to draw conclusions on the quality of the data, especially for indicators that take into account the dispersion and distribution of z-scores.More research is needed to quantify in definitive terms how much of the distribution of z-scores is attributable to population heterogeneity or measurement error.In the absence of cut-off points or more appropriate approaches that consider such limitations, a joint assessment of quality indicators is recommended 19 .
Based on the results of this study, we highlight the importance of actions to improve critical points identified in the quality of anthropometric data from SISVAN: 1) maintenance and expansion of intersectoral policies and health programs that promote FNS actions, as occur in the BFP, School Health Program and Growing Healthy Program; 2) development of qualification and continuing education actions (preferably attentive and flexible to the different local realities); 3) maintenance and expansion of financial support to municipalities for structuring FNS in PHC, through the acquisition and periodic calibration of anthropometric equipment; 4) computerization in PHC services, allowing professionals in primary health units to promptly record data on care, including weight and height data, in the patient' s electronic medical record in e-SUS APS; and 5) implantation and implementation of routine for continuous verification and production of reports on the quality of data in SISVAN.

CONCLUSION
Overall, the results suggest that the quality of anthropometric data in SISVAN has substantially improved over the years.However, some indicators still require attention.The coverage of the target population remains incipient for a surveillance system whose objective is the universal monitoring of the public that uses SUS primary care.The accuracy and quality of anthropometric measurements, especially of height, were lower in records of children under two years of age and residents in North, Northeast, and Midwest regions.Such groups are the portion of the child population most vulnerable to nutritional problems, requiring accurate estimates that can support the monitoring of the population nutritional profile and the development of public policies.

1 inChart 1 .
which n is the total number of observations, Yi is each value in the database, and Y ˉ is the mean of observations All records with biologically plausible z-score values for the anthropometric index of interest.Continue https://doi.org/10.11606/s1518-8787.2023057004655Description of the quality indicators of anthropometric data.Children 0-59 months of age.SISVAN, Brazil, 2008-2017.Continuation Normality of z-score values Distribution of plausible z-score values for each anthropometric index Kernel density plots were used to examine the distribution pattern of the z-score values for each index.All records with biologically plausible z-score values for the anthropometric index of interest.Skewness of plausible z-score values for each anthropometric index Skewness was calculated by the Fisher-Pearson coefficient: were observed in 2008.A stability in the dispersion of z-scores for all indices was noticed after 2008: H/A (1.6), W/H (1.5), W/A (1.3) and BMI/A (1.5).In general, there was greater dispersion of z-score values among children two years of age or younger and residents in the Northeast, North, and Midwest regions (Figure4).

Table 1 .
Quality indicators of anthropometric data.Children 0-59 months of age.SISVAN, Brazil, 2008-2017.Data were analyzed using Stata software version 15.1 (Stata Corporation, College Station, USA).Since the same record can be entered in the different information systems that comprise SISVAN, duplicate records were identified and removed considering the following variables: identification code of the individual, date of birth, date of follow-up, weight, and height.Duplicate records were accepted when all values of considered variables were equal.Quality indicators were annually described and according to the following variables when applicable: sex, age, region, and federative unit.In total, 73,745,023 records and 29,852,480 children under five years of age were identified in the SISVAN nutritional status databases between 2008 and 2017, after excluding 27,167,791 duplicate records.The coverage of the target SUS-using population increased from 17.7% in 2008 to 45.4% in 2017 (Table SISVAN: Food and Nutrition Surveillance System; SUS: Unified Health System; IBGE: Brazilian Institute of Geography and Statistics; ANS: National Supplementary Health Agency; H/A: Height/Age; W/H: Weight/Height; W/A: Weight/Age; BMI/A: Body Mass Index/Age.https://doi.org/10.11606/s1518-8787.2023057004655SISVAN: Food and Nutrition Surveillance System; H/A: Height/Age; W/H: Weight/Height; W/A: Weight/Age; BMI/A: Body Mass Index/Age.https://doi.org/10.11606/s1518-8787.2023057004655Data Processing and Analysis

Table 2 .
Coverage of the target population and total population, by region and federative unit.Children 0-59 months of age.SISVAN, Brazil.
SISVAN: Food and Nutrition Surveillance System.SUS: Unified Health System.
11SVAN, coming from the BFP11, part of the distribution of this age can be explained by the late inclusion and monitoring of children by the program. https://doi.org/10.11606/s1518-8787.2023057004655